SUMMARY


The Time Warp multiprocessing scheme promises speedup for object-oriented
discrete-event simulations. The Concurrent Processing for Advanced Simulation
project has constructed a Lisp-based Time Warp system for implementing
simulations with many large, complex objects. Since many objects share a single
processor, the CPU time allocated to each object must be scheduled. Since
object events are not preempted, we schedule which objects have events
processed rather than the CPU time per object. We developed approaches to
scheduling ranging from a simple Round-Robin mechanism to complex ones
involving queue length.

We developed ten different scheduling algorithms, which we named Worst
Case, Conventional Round-Robin, Lowest Local Virtual Time (LVT) First,
Priority LVT, Largest Queue Priority, Bradford/Fitch, Anti-Penalty, Queue
Anti-Penalty, Queue Cycle, and Positive Infinity.

Results show that LVT, anti-messages, rollbacks, returned messages, and
anti-reminders are good parameters for scheduling of system resources. Input
queue size is also an important factor, but when taken with or without LVT, it
does not produce results as good as using LVT alone. The Round-Robin scheduler
was one of the worst performers. The poor performance of the simple
Round-Robin scheduler indicates the advantages of using state information to
determine the scheduling order in the Time Warp system.

Benchmarks of the schedulers showed that the Anti-Penalty scheduler
performed better than the others. The Anti-Penalty algorithm is based on a
composite measure of simulation advance rate, flow control, and the appearance
of specific message types. The benchmark simulation executed on a five-processor
Time Warp system.
1  Introduction

The Time Warp multiprocessing scheme promises speedup for object-oriented
discrete-event simulations. RAND's Concurrent Processing for Advanced
Simulation project has constructed a Lisp-based Time Warp system for implementing
simulations with many large, complex objects. Time Warp is an optimistic
strategy for distributed simulation that distributes objects and workload across
nodes in a network of workstations. This paper describes a study of the scheduling
algorithm used to share the same processing unit among a number of objects.¹

To minimize storage fragmentation, our multiprocessing system operates as
a single job per processor. The system allocates dynamic storage from a single
heap shared by a processor's resident objects. The task scheduler allows each
object to execute for a number of simulation event cycles before switching
contexts to another. Our experiments vary the number of cycles, the frequency
with which an object is selected for execution, and the order of object
selection. Our goal was to identify parameters that play a major role in Time
Warp scheduling and to test various mechanisms that improve its performance.

After describing the general class of Time Warp systems, we review the
different classes of distributed schedulers and argue for the existence of a new
class. We then describe the scheduler variants we tested, present the results,
and close with a discussion of the costs and benefits of using different
features of the state to improve performance.

2  The  Time  Warp  Mechanism 

Time Warp is an object-oriented message-passing scheme for transparent
multiprocessing [4, 6]. It is particularly well suited for object-oriented
discrete-event simulations with large numbers of objects having complex
behaviors. Objects require a large amount of code to define their behavior.
Time Warp differs from single processor discrete-event simulations in three key
ways. First, while conventional discrete-event simulations have a single
simulation clock serving all objects, each Time Warp object has its own clock.
Second, in a conventional discrete-event simulation, all objects execute with
the same simulation time, whereas in a Time Warp simulation, each object
proceeds at its own rate. Third, conventional discrete-event simulations require
state saving only for excursions (where an object would predict the future by
simulating it). In a Time Warp simulation, on the other hand, frequent state
saving is required, in order that objects computing with incorrect values can
restart after a rollback.

¹ This is an expanded version of an article which originally appeared in the April 1990 issue

Time Warp objects communicate by sending messages. All messages have
time stamps indicating both the sending and receiving time. If a message arrives
in an object's future, the system puts it on the object's input queue for future
processing. Messages arriving in an object's past cause the object to roll back
to a time prior to the message. Messages that have already been sent are
cancelled by sending a corresponding anti-message. When an object's time is the
same as a message's time, the system removes the message from the input queue
(also called the event queue), evaluates the message text, and processes any
output messages. This object cycle serves as the basis for our non-preemptive
scheduling algorithms. This contrasts with a preemptive scheme that might
suspend an object's execution in the middle of a cycle. We chose non-preemptive
scheduling because it is difficult to move objects between machines in the
middle of an event cycle. Likewise, in our Lisp programming environment,
preemptive algorithms require complex and expensive context switches.
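
The object cycle and rollback behavior described above can be illustrated with a small sketch. It is written in Python for brevity rather than in the Lisp of the actual system, and it is not the project's implementation: the names `Message`, `TWObject`, `cycle`, and `roll_back` are hypothetical, the "state" is just a counter of processed events, and handling of incoming anti-messages is omitted.

```python
from dataclasses import dataclass, field, replace

@dataclass(frozen=True)
class Message:
    send_time: int
    recv_time: int
    text: str
    anti: bool = False  # True marks an anti-message (assumed representation)

@dataclass
class TWObject:
    lvt: int = 0        # this object's own clock
    state: int = 0      # toy state: number of events processed
    input_queue: list = field(default_factory=list)
    processed: list = field(default_factory=list)  # (msg, lvt_before, state_before)
    sent: list = field(default_factory=list)
    rollbacks: int = 0

    def receive(self, msg):
        """Queue a message; a message in the object's past forces a rollback."""
        anti = self.roll_back(msg.recv_time) if msg.recv_time < self.lvt else []
        self.input_queue.append(msg)
        self.input_queue.sort(key=lambda m: m.recv_time)
        return anti  # anti-messages the caller would deliver to the receivers

    def cycle(self):
        """One non-preemptible object cycle: process the earliest queued message."""
        if not self.input_queue:
            return False
        msg = self.input_queue.pop(0)
        self.processed.append((msg, self.lvt, self.state))  # save state first
        self.lvt = msg.recv_time
        self.state += 1
        return True

    def send(self, recv_time, text):
        msg = Message(self.lvt, recv_time, text)
        self.sent.append(msg)
        return msg

    def roll_back(self, to_time):
        """Restore the state saved before the first event at or after to_time."""
        self.rollbacks += 1
        while self.processed and self.processed[-1][0].recv_time >= to_time:
            msg, self.lvt, self.state = self.processed.pop()
            self.input_queue.append(msg)  # undone events must be reprocessed
        cancelled = [m for m in self.sent if m.send_time >= to_time]
        self.sent = [m for m in self.sent if m.send_time < to_time]
        return [replace(m, anti=True) for m in cancelled]
```

Because `cycle` always runs to completion, a scheduler built on it only chooses which object cycles next, which is the non-preemptive arrangement the text describes.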

Global Virtual Time (GVT) is the time of the farthest-behind local clock among
all simulation objects. This time is equivalent to the real time of the simulation.
All input and output is synchronized with GVT. Since an object cannot send a
message into its past, objects can never roll back to a time earlier than GVT. All
messages and states saved prior to GVT are subject to fossil collection, which
frees their storage space. The computation of GVT occurs at periodic intervals.
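
A minimal sketch of this definition follows, assuming GVT is taken simply as the minimum local clock, as stated above. The function names and the `(time, item)` representation of saved entries are illustrative assumptions, and messages still in transit are ignored.

```python
def global_virtual_time(lvts):
    """GVT as defined in the text: the farthest-behind local clock."""
    return min(lvts)

def fossil_collect(saved, gvt):
    """Keep only saved (time, item) entries that are not prior to GVT."""
    return [(t, item) for t, item in saved if t >= gvt]

# Hypothetical local clocks for three objects on one processor.
lvts = {"A": 40, "B": 25, "C": 60}
gvt = global_virtual_time(lvts.values())

# Hypothetical saved states for one object; the entry at time 10 is a fossil.
saved_states = [(10, "s0"), (25, "s1"), (30, "s2")]
remaining = fossil_collect(saved_states, gvt)
```

Since no object can roll back to a time earlier than GVT, the dropped entries can never be needed again.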

Since our workstations are quite large (8 to 24 megabytes of memory), each
can service many objects. A Time Warp scheduling routine running on each
processor optimizes various system parameters to reduce the global program
execution time. Because we wanted to scale the system up, we did not allow
interprocessor communications between scheduling algorithms. Finally, it is
extremely difficult to compute event dependencies from program source code.
Consequently, our algorithms assume no known scheduling dependencies.

3  A  New  Category  of  Distributed  Scheduler 

There are many types of single processor schedulers: deadline, priority,
preemptive, non-preemptive, etc. Two of the main objectives for all scheduling
disciplines are to maximize throughput and reduce overhead [3]. Distributed
schedulers must also meet the same requirements. A review of the literature
reveals two categories of distributed schedulers.

1.  Scheduling  of  independent  tasks

2.  Scheduling  of  dependent  tasks 

Tannenbaum [11] states that the first type can be scheduled randomly, and
then only discusses how to handle the scheduling of dependent tasks. There are
also numerous studies of dependent systems [7, 8, 9, 10].

Dependency information is normally not part of the simulation code, nor
in general can it be computed automatically. Consequently, the dependency-based
schemes are not appropriate for our purposes. Time Warp processes may
be scheduled in various orderings and the system will still complete execution.
Thus, Time Warp would appear to fall into Tannenbaum's first category.
However, our results show random scheduling to be extremely poor. Although there
are dependencies in a Time Warp simulation, we can't reliably predict them.
Therefore, our schedulers fall into a third category that relies on indirect
measures of dependency based on simulation state.

The schedulers discussed in this paper are also not covered by Casavant and
Kuhl's [2] taxonomy of distributed schedulers. At the top of their hierarchical
structure is local versus global scheduling. They describe global schedulers as
those that decide where a task should execute. Local schedulers run
concurrently (one on each processor) and determine the scheduling for their
respective processors only. Since our schedulers make no decision of where a task
should execute, they must be local. Casavant and Kuhl discuss subcategories of
global schedulers, but their taxonomy stops at this point.

Nonetheless, many of Casavant and Kuhl's subcategories for global schedulers
apply to our local schedulers. For instance, some of our schedulers fall
into their dynamic category, because they use run-time information about the
task to determine the scheduling order. They are noncooperative, because
individual processors make decisions independent of the actions of other
processors. Our schedulers also use sub-optimal approximations to determine
which task to schedule next. For example, Time Warp information such as Local
Virtual Time (LVT) and input queue length prove useful for the schedulers but
are clearly non-optimal. We will show that the best schedulers are those that
are local, noncooperative, and use sub-optimal approximations. We will also
show that a well-chosen scheduling method results in improved Time Warp
performance.

4  Time  Warp  Scheduling  Algorithms 

Our first task was to identify system characteristics that affect total system
throughput: a simulation's start-to-finish wall clock time. We instrumented
the Time Warp system to measure various parameters that determine system
performance. These include the lengths of Time Warp message queues, standard
deviation of LVT across all objects, CPU time per object cycle, number
of rollbacks, network utilization, and many others. Starting with the simpler
schedulers, we examined advance rates and execution statistics to make
inferences on how to improve the GVT advance rate. Though we could not examine
all combinations of these system parameters, we chose important combinations
that would account for widely varying event times and system load. We tested
the following ten scheduling algorithms.

1. Worst Case: An unlimited Round-Robin scheduler where each object
runs to completion or exhaustion of its input queue. This algorithm works
well in simulations with no interaction between objects, but exhibits
worst-case and even divergent behavior with dependent objects. We were not
able to test this scheme on any but the most trivial simulations.

2. Round-Robin: We implemented this algorithm as a control in the
experiment. This is essentially the random scheduling algorithm in
Tannenbaum's taxonomy. Each object executes a fixed number of object-cycles
before relinquishing control. A very large number of object-cycles would
make the Round-Robin scheduler degenerate into the Worst Case scheduler.
By experiment, we've determined that three is the best number of
cycles.

3. Lowest LVT First: This algorithm isolates LVT as a measure for
scheduling. The idea is to process the object with the lowest LVT first to get it
closer in time to other objects in the system. This scheduler is essentially
the Round-Robin scheduler in which the order of object execution
is changed, but the number of cycles per object remains constant. Each
time the system cycles through the queue, it places the object with the
lowest LVT at the head. The reasoning behind this algorithm is as follows:
rollback occurs when object B sends a message, X, to object A with a
time stamp earlier than A's current LVT. Thus, object A must roll back
to a time previous to X's time stamp.

4. Priority LVT: Like the Lowest LVT First algorithm, this scheduler
emphasizes objects with low LVTs. It computes the standard deviation of the
LVTs of the processor's objects. It then places all objects with LVTs less
than the mean LVT plus the standard deviation onto a priority scheduling
queue. Objects on the priority queue get more event cycles than objects
on the standard scheduling queue. The idea of both this and the Lowest
LVT scheme is to allow objects that are farthest behind to catch up with
the others.

5. Largest Queue Priority: This algorithm emphasizes objects with large
unprocessed event queues. It is exactly the same as Priority LVT except
that it computes the standard deviation of unprocessed event queue sizes.
When an object's queue length is greater than the mean plus the standard
deviation, it goes onto the priority queue. The belief is that a large event
queue predicts a large processing requirement. Processing more of the
events should even out the wide disparity in LVTs, resulting in improved
performance.

6. Queue Cycle: This algorithm uses queue length rather than LVT
as its scheduling heuristic. It simply schedules the object with the
largest unprocessed event queue next. It schedules each object for three
cycles.

7. Bradford/Fitch: The Bradford/Fitch algorithm attempts to emphasize
objects doing useful work by penalizing objects with high LVTs. For each
object, it computes a penalty value. Out of those objects with the lowest
penalty, it picks one at random and cycles it up to three times. After
each object cycle, the Bradford/Fitch algorithm checks if any rollbacks
or returned messages occurred since it last checked. A returned message
indicates that an object's input queue is full and is thus returning any new
ones to the sender. Rollbacks and returned messages suggest that other
objects need to be processed for the system to continue doing useful work.
Thus, if any are detected, the scheduler then finds the object with the
lowest LVT and schedules it; otherwise, the same object is rescheduled.

8. Anti-Penalty: This algorithm is a variation on the Bradford/Fitch
scheduler that attempts to stop the cycling for anti-messages and anti-reminders.
Anti-reminders are generated by the priority request mechanism [5]. Like
the Bradford/Fitch scheduler, the penalty function finds the object with
the lowest LVT to schedule. However, it continues scheduling that object
for up to three cycles or until a rollback, returned message, anti-message,
or anti-reminder shows up. Bradford/Fitch only considers returned
messages and rollbacks.

9. Queue Anti-Penalty: This variant of the Anti-Penalty and Bradford/Fitch
algorithms uses unprocessed event queue length, rather than lowest LVT,
as the penalty function.

10. Positive Infinity: This algorithm puts objects that have reached positive
infinity, meaning they have nothing more to do, on a low priority queue.
It schedules objects below positive infinity three times more often than
those at positive infinity, because objects at positive infinity have finished
working and are just waiting for the system to complete or to roll back.
This scheduler focuses on the termination condition in Time Warp, while
ignoring the interval previous to termination.
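
The selection rules in items 3 through 6 can be sketched compactly. The following is an illustration only, written in Python rather than the system's Lisp: the `Obj` record, its field names, and the function names are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which form was used.

```python
from dataclasses import dataclass
from statistics import mean, pstdev

@dataclass
class Obj:
    name: str
    lvt: float       # local virtual time
    queue_len: int   # unprocessed events in the input queue

def lowest_lvt_first(objs):
    """Item 3: the object farthest behind goes to the head of the queue."""
    return min(objs, key=lambda o: o.lvt)

def queue_cycle(objs):
    """Item 6: the object with the largest unprocessed event queue runs next."""
    return max(objs, key=lambda o: o.queue_len)

def priority_lvt(objs):
    """Item 4: objects with LVT below mean + standard deviation get priority."""
    values = [o.lvt for o in objs]
    cutoff = mean(values) + pstdev(values)
    return [o for o in objs if o.lvt < cutoff]

def largest_queue_priority(objs):
    """Item 5: objects with queue length above mean + standard deviation."""
    values = [o.queue_len for o in objs]
    cutoff = mean(values) + pstdev(values)
    return [o for o in objs if o.queue_len > cutoff]

# Three hypothetical objects resident on one processor.
objs = [Obj("A", 10, 2), Obj("B", 20, 9), Obj("C", 60, 1)]
```

With these made-up values, `lowest_lvt_first` picks A, `queue_cycle` picks B, `priority_lvt` places A and B on the priority queue, and `largest_queue_priority` places only B there. All four rules read only per-processor state, consistent with the local, noncooperative design described in Section 3.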

There are a large number of system parameters that are potential measures
of system performance. We selected ones that seemed reasonable based on
our understanding and experience with the Time Warp system. Attempting to
cover all possible combinations of these measurements is beyond the scope of this
paper. However, we have tested a reasonably wide range and have shown that
certain combined measurements are better indicators of system performance
than others.


5  A  Test  Simulation 

Because of time constraints, we picked a single worst case Time Warp simulation
in which all objects communicate with all other objects at every simulation time
step. We maintained constant parameters for the execution of the simulation,
varying only processor load and scheduling algorithms.

The simulation consists of a number of moving objects which have
pre-assigned velocities. These move about on a grid and change direction when
they bump into the surrounding walls or when two or more land on the same
grid cell. A graphics object displays moving object location on a regular basis, and
a console object prints the numeric location and direction changes. All objects
were programmed to accept information in a non-deterministic order. However,
with the exception of the console display, the simulation is still deterministic.

The simulation was run on a network of four SUN-4 workstations connected
by a ten megabit local area network. Each processor has its own local disk to
avoid paging over the network. The network was run without any load other
than that created by Time Warp. The controlling workstation is a 10 MIPS
processor with 24 megabytes of main memory running the 12 megabyte GVT
and graphics process. The three Time Warp processors are eight megabyte
SUN-4s running a 12 megabyte process with three objects per machine. There
is no dynamic load balancing (dynamic load balancing is the subject of another
paper [1]).

Measured wall-clock time does not include Time Warp system initialization.
However, the times are affected by the size of the Time Warp GVT interval, as
the last GVT cycle may have few events. With a GVT cycle time of five seconds
and a minimum simulation execution time of 100 seconds, there is an error of
only five percent.

6  Experimental  Results 

We tested the ten algorithms described in Section 4. The worst-case
scheduler is more than O(n^) where n is the number of messages between
objects. It completed only in extremely simple cases and is therefore
excluded from the figures.

To compare the performance of the algorithms, the test simulation was run
16 times with each. As we are running on a public network of workstations,
network delays and uncontrolled processor loads cause a variance in execution
time. A good algorithm will also be responsive to these independent changes. The
histograms in Figures 1-9 show the distribution of execution times for the
different schedulers. For example, Figure 1 shows that the Lowest LVT First
scheduler executed nine times at 110 seconds, six times at 120 seconds, and one
time at 130 seconds.

The Anti-Penalty scheduler produced the best performance, followed by a tie
between the Queue Anti-Penalty scheduler and the Bradford/Fitch scheduler.
The results show the correctness of the assumptions made by the Bradford/Fitch
scheduler: scheduling the object with the lowest LVT next and blocking when
it can make no more progress reduces simulation wall-clock time.

The only difference between the Anti-Penalty scheduler and the Bradford/Fitch
scheduler is when the scheduler blocks the currently executing object. The
Bradford/Fitch scheduler blocks as soon as a rollback occurs or a message is
returned, because any more work done by the currently executing object at this
point will be wasted. The Anti-Penalty scheduler goes one step further. It also
blocks an object if anti-reminders or anti-messages are sent. Anti-messages
indicate that an object will be rolled back, so it is best to block the current
object so that the object to roll back can do so immediately, because any
messages it gets after the anti-message are likely to be undone. It also
considers anti-reminders, because the priority request mechanism can cause an
object to roll back. The results show that blocking on anti-messages and
anti-reminders in addition to rollbacks or returned messages enhances the
performance of the scheduling algorithm.
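
The difference in blocking conditions can be made concrete with a short sketch. This is an illustration under assumptions, not the system's code: the event names are hypothetical labels, and `events_per_cycle` stands in for whatever the instrumented system observed after each object cycle.

```python
# Conditions that make each scheduler block the currently executing object.
BRADFORD_FITCH_BLOCKS = frozenset({"rollback", "returned_message"})
ANTI_PENALTY_BLOCKS = BRADFORD_FITCH_BLOCKS | {"anti_message", "anti_reminder"}

def cycles_before_block(events_per_cycle, block_on, max_cycles=3):
    """Count the object cycles run before the scheduler blocks the object.

    events_per_cycle[i] is the set of event names seen after cycle i + 1.
    The object runs at most max_cycles cycles (three in the text).
    """
    ran = 0
    for events in events_per_cycle[:max_cycles]:
        ran += 1
        if block_on & set(events):
            break  # a blocking condition appeared; pick another object
    return ran

# Hypothetical trace: an anti-message is sent during the second cycle.
trace = [set(), {"anti_message"}, set()]
bf_cycles = cycles_before_block(trace, BRADFORD_FITCH_BLOCKS)
ap_cycles = cycles_before_block(trace, ANTI_PENALTY_BLOCKS)
```

On this trace Bradford/Fitch lets the object use all three cycles, while Anti-Penalty stops it after the second, giving the object that is about to roll back a chance to run sooner.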

These results show that LVT, anti-messages, rollbacks, returned messages,
and anti-reminders are the best indicators of system load. Input queue size is
also an important factor, but when taken with or without LVT, it does not
produce results as good as using LVT alone. The Round-Robin scheduler was
one of the worst performers. The poor performance of the simple Round-Robin
scheduler indicates the advantages of using state information to determine the
scheduling order in the Time Warp system.

Figure 10 ranks scheduler performance, with the best at the bottom. For
each scheduler, there is a bar indicating the range of times it took to execute.
The shaded portion of the bar indicates times below the mean, and the unshaded
portion of the bar illustrates the times above the mean. The mean time for each
scheduler is the dividing line between the shaded and unshaded regions.
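
As a worked example of how one such bar is derived, the sketch below summarizes the 16 Lowest LVT First run times quoted in the text for Figure 1. The function name `bar_summary` is an illustrative assumption.

```python
from statistics import mean

def bar_summary(times):
    """Return (fastest, mean, slowest): the bar's ends and its dividing line."""
    return min(times), mean(times), max(times)

# Figure 1 as quoted in the text: nine runs at 110 s, six at 120 s, one at 130 s.
lowest_lvt_times = [110] * 9 + [120] * 6 + [130]

fastest, average, slowest = bar_summary(lowest_lvt_times)
# fastest = 110, average = 115, slowest = 130 (seconds)
```

The bar for this scheduler would therefore span 110 to 130 seconds, with the shaded region ending at the 115-second mean.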

From Figure 10, we can see that there were three groups of schedulers: those
whose times ranged from 100 to 110 seconds, from 100 to 130 seconds, and from
100 to 200 seconds. The three schedulers in the fastest group all depend on
aspects of the object state information. This shows the dramatic difference the
use of Time Warp state information in scheduling decisions can make on system
performance.


7  Conclusions 

We tested ten different algorithms for the Time Warp system. Through timing
comparisons, we have shown that the best performance is achieved by examining
information about the object's state to determine which object to schedule
next. Those algorithms fall into a new category for distributed system
schedulers. The previous categories consisted of schedulers for independent
processes and schedulers for processes with nested dependencies. The schedulers
have attributes of Casavant and Kuhl's taxonomy: they are local, noncooperative,
and use sub-optimal approximations, but they don't firmly fit into that taxonomy
either. A category that applies to these schedulers would cover local schedulers
of independent processes that benefit from the use of process state information
in the scheduling decision.

Benchmarks of the schedulers showed that the Anti-Penalty scheduler
performed better than the others. The Anti-Penalty scheduler selects the object
with the lowest LVT to schedule. It continues scheduling it for up to three cycles
or until a rollback, returned message, anti-message, or anti-reminder shows up.
The benchmarks were run with a simulation executing on a five-processor
Time Warp system. The benchmarks also showed that LVT, anti-messages,
rollbacks, returned messages, and anti-reminders are the best indicators of system
load on Time Warp.

The  dynamic  load  balancing  techniques  that  we  have  tested  use  the  same 
inputs  for  measuring  processor  load  under  Time  Warp  as  do  the  scheduling 
algorithms.  While  a  description  of  load  balancing  is  left  to  another  paper,  it  is 
important  to  point  out  that  the  same  parameters  for  measuring  processor  load 
apply  to  both  scheduling  and  load  balancing. 
Figure  '.i:  Queue  Anti-Penalty  Scheduler
Figure  9:  Largest  Queue  Priority  Scheduler
Figure  10:  Summary  of  Schedulers


